PROTEIN-PROTEIN INTERACTIONS

INVOLVING TRANSFORMING GROWTH FACTOR β SIGNALING OR INVOLVING

TRANSDUCTION SIGNALS OF TRANSFORMING FACTOR β FAMILY MEMBERS



The present application claims priority to US provisional applications No. 60/333,348 
filed on November 26, 2001, No. 60/384,537 filed on May 31, 2002 and No. 60/422,471 filed 
on October 30, 2002. 

BACKGROUND AND PRIOR ART 

Most biological processes involve specific protein-protein interactions. Protein-protein 
interactions enable two or more proteins to associate. A large number of non-covalent bonds 
form between the proteins when two protein surfaces are precisely matched. These bonds 
account for the specificity of recognition. Thus, protein-protein interactions are involved, for 
example, in the assembly of enzyme subunits, in antibody-antigen recognition, in the 
formation of biochemical complexes, in the correct folding of proteins, in the metabolism of 
proteins, in the transport of proteins, in the localization of proteins, in protein turnover, in
post-translational modifications, in the core structures of viruses and in signal transduction.

General methodologies to identify interacting proteins or to study these interactions
have been developed. Among these methods is the two-hybrid system originally
developed by Fields and co-workers and described, for example, in U.S. Patent Nos.
5,283,173, 5,468,614 and 5,667,973, which are hereby incorporated by reference.

The earliest and simplest two-hybrid system, which served as the basis for development of
other versions, is an in vivo assay between two specifically constructed proteins. The first
protein, known in the art as the "bait protein", is a chimeric protein which binds to a site on
DNA upstream of a reporter gene by means of a DNA-binding domain or BD. Commonly,
the binding domain is the DNA-binding domain from either Gal4 or native E. coli LexA and
the sites placed upstream of the reporter are Gal4 binding sites or LexA operators,
respectively.

The second protein is also a chimeric protein, known as the "prey" in the art. This
second chimeric protein carries an activation domain or AD. This activation domain is
typically derived from Gal4, from VP16 or from B42.

Besides the two-hybrid systems, other improved systems have been developed to
detect protein-protein interactions. For example, a two-hybrid plus one system was
developed that allows the use of two proteins as bait to screen available cDNA libraries to
detect a third partner. This method permits the detection of interactions between proteins
that are part of a larger protein complex such as the RNA polymerase II holoenzyme and the
TFIIH or TFIID complexes. Therefore, this method, in general, permits the detection of
ternary complex formation as well as inhibitors preventing the interaction between the two
previously defined fused proteins.

Another advantage of the two-hybrid plus one system is that it allows or prevents the
formation of the transcriptional activator since the third partner can be expressed from a
conditional promoter such as the methionine-repressed Met25 promoter which is positively
regulated in medium lacking methionine. The presence of the methionine-regulated
promoter provides an excellent control to evaluate the activation or inhibition properties of the
third partner due to its "on" and "off" switch for the formation of the transcriptional activator.
The three-hybrid method is described, for example, in Tirode et al., The Journal of Biological
Chemistry, 272, No. 37 pp. 22995-22999 (1997), incorporated herein by reference.

Besides the two-hybrid and two-hybrid plus one systems, yet another variant is that described
in Vidal et al., Proc. Natl. Acad. Sci. 93 pgs. 10315-10320, called the reverse two- and
one-hybrid systems, where a collection of molecules can be screened for those that inhibit a
specific protein-protein or protein-DNA interaction, respectively.

A summary of the available methodologies for detecting protein-protein interactions is
described in Vidal and Legrain, Nucleic Acids Research Vol. 27, No. 4 pgs. 919-929 (1999)
and Legrain and Selig, FEBS Letters 480 pgs. 32-36 (2000), which references are
incorporated herein by reference.

However, the above conventionally used approaches, and especially the commonly
used two-hybrid methods, have their drawbacks. For example, it is known in the art that,
more often than not, false positives and false negatives exist in the screening method. In
fact, a doctrine has been developed in this field for interpreting the results, and in common
practice an additional technique such as co-immunoprecipitation or gradient sedimentation of
the putative interactors from the appropriate cell or tissue type is generally performed. The
methods used for interpreting the results are described by Brent and Finley, Jr. in Ann. Rev.
Genet., 31 pgs. 663-704 (1997). Thus, the data interpretation is very questionable using the
conventional systems.

One method to overcome the difficulties encountered with the methods in the prior art
is described in WO99/42612, incorporated herein by reference. This method is similar to the
two-hybrid system described in the prior art in that it also uses bait and prey polypeptides.
However, the difference with this method is that a step of mating at least one first haploid
recombinant yeast cell containing the prey polypeptide to be assayed with a second haploid
recombinant yeast cell containing the bait polynucleotide is performed. Of course the person
skilled in the art would appreciate that either the first recombinant yeast cell or the second
recombinant yeast cell also contains at least one detectable reporter gene that is activated
by a polypeptide including a transcriptional activation domain.

The method described in WO99/42612 permits the screening of more prey
polynucleotides with a given bait polynucleotide in a single step than in the prior art systems
due to the cell-to-cell mating strategy between haploid yeast cells. Furthermore, this method
is more thorough and reproducible, as well as sensitive. Thus, the presence of false
negatives and/or false positives is extremely minimal as compared to the conventional prior
art methods.

Transforming growth factor β (TGFβ) belongs to a super-family of cytokines, including
TGFβ1, TGFβ2, TGFβ3, activins and Bone Morphogenetic Proteins (hereinafter BMP),
which are synthesized by many cell types and have a variety of cellular and biological
effects, including control of proliferation, differentiation, migration, angiogenesis, immunity
and regulation of the turnover of the extracellular matrix. A number of disease states are
known to be associated with variations in expression of genes which are controlled by TGFβ
and related cytokines, including fibrotic disorders, abnormal wound healing, abnormal bone
formation, cancer and tumor development, neurologic disorders, haematopoiesis and
immune and inflammatory disorders.

Signaling by this family of cytokines is transduced by heteromeric complexes of
transmembrane Ser/Thr kinase receptors. Upon ligand binding, the type II receptor
phosphorylates and activates the type I receptor, which then propagates signals to
downstream targets, in particular the Smad proteins.

Ten mammalian Smad proteins have been identified and divided into three classes.
The first includes pathway-restricted proteins such as Smad1, Smad5 and Smad8, which are
specifically involved in BMP signaling, and Smad2 and Smad3, which are restricted to the
TGFβ/activin pathway. The second class contains the common-mediator Smad4, implicated
in both BMP and TGFβ/activin pathways. The third class contains the inhibitory Smads,
Smad6 and Smad7. At least Smad2 and Smad3 are retained in the cytoplasm by binding to
the SARA protein. After phosphorylation by TGFβ-activated type I receptor on their
carboxy-terminal SSXS sequence, pathway-restricted Smads form heteromeric complexes with
Smad4 and then translocate to the nucleus where they control expression of diverse genes
involved in various biological processes such as control of cellular proliferation and
differentiation, regulation of the immune system and regulation of the extracellular matrix
formation.

Several proteins such as TGIF, Ski, SnoN, SNIP1 and CBP have been identified as
Smad transcriptional co-regulators and shown to modulate the transcriptional ability of Smad
proteins by direct interactions. Finally, proteins such as Smurf1 and Smurf2 are involved in
degradation of Smad proteins by the proteasome machinery.

Several members of the TGFβ/BMP pathways (SARA, Smurf1, Smurf2, Smad1,
Smad2/hMAD2, Smad3/hMAD-3, Smad4, Smad5/MADH5, Smad7, Smad9/MADH6, SNIP1,
SnoN) have been used as baits in yeast two-hybrid screening experiments. Several proteins
have been identified as interactors with those baits (Figure 10). Functional data in
mammalian cells are shown here that validate those interactants as proteins involved
in TGFβ/BMP signaling.

Thus, there is still a need to explore all mechanisms relating to transforming growth
factor β protein and to identify drug targets for fibrotic disorders, abnormal wound healing,
abnormal bone formation, cancer and tumor development, neurologic disorders,
haematopoiesis and immune and inflammatory disorders and/or diseases.

SUMMARY OF THE PRESENT INVENTION 
Thus, it is an aspect of the present invention to identify protein-protein interactions
involving proteins of the transforming growth factor β super-family of cytokines transduction
pathway and to identify drug targets for fibrotic disorders, abnormal wound healing, abnormal
bone formation, cancer and tumor development, neurologic disorders, haematopoiesis and
immune and inflammatory disorders and/or disease.

It is another aspect of the present invention to identify protein-protein interactions
involved in transforming growth factor β-mediated disorders and/or diseases for the
development of more effective and better targeted therapeutic treatments.

It is yet another aspect of the present invention to identify complexes of polypeptides or
polynucleotides encoding the polypeptides and fragments of the polypeptides of the
transforming growth factor β super-family of cytokines transduction pathway.

It is yet another aspect of the present invention to identify antibodies to these
complexes of polypeptides or polynucleotides encoding the polypeptides and fragments of
the polypeptides involving transforming growth factor β signaling, including polyclonal as well
as monoclonal antibodies that are used for detection.

It is still another aspect of the present invention to identify selected interacting domains
of the polypeptides, called SID® polypeptides.

It is still another aspect of the present invention to identify selected interacting domains
of the polynucleotides, called SID® polynucleotides.

It is still another aspect of the present invention to provide a diagnostic kit to test for
deficiencies in the transforming growth factor β super-family of cytokines transduction
pathway.

It is another aspect of the present invention to identify interacting proteins in the
transforming growth factor β super-family of cytokines transduction pathway that can be used
in pharmaceutical compositions or for diagnostic purposes.

It is another aspect of the present invention to generate protein-protein interactions 
maps called PIM®s. 

It is yet another aspect of the present invention to provide a method for screening
drugs for agents which modulate the interaction of proteins and pharmaceutical compositions
that are capable of modulating the protein-protein interactions involved in transforming
growth factor β disorders and/or diseases.

It is another aspect to administer the nucleic acids of the present invention via gene 
therapy. 

It is yet another aspect of the present invention to provide protein chips or protein
microarrays.

It is yet another aspect of the present invention to provide a report in, for example,
paper, electronic and/or digital forms, concerning the protein-protein interactions, the
modulating compounds and the like as well as a PIM®.

These and other aspects are achieved by the present invention as evidenced by the
summary of the invention, description of the preferred embodiments and the claims.

Thus the present invention relates to a complex of interacting proteins of columns 1 
and 4 of Table 2. 

Furthermore, the present invention provides SID® polynucleotides and SID®
polypeptides of Table 3, as well as a PIM® involved in transforming growth factor β-mediated
disorders and/or diseases.

The present invention also provides antibodies to the protein-protein complexes
involved in transforming growth factor β-mediated disorders and/or diseases.

In another embodiment the present invention provides a method for screening drugs for
agents that modulate the protein-protein interactions and pharmaceutical compositions that
are capable of modulating protein-protein interactions.

In another embodiment the present invention provides protein chips or protein
microarrays.

In yet another embodiment the present invention provides a report in, for example,
paper, electronic and/or digital forms.

BRIEF DESCRIPTION OF THE DRAWINGS 
Fig. 1 is a schematic representation of the pB6 plasmid.

Fig. 2 is a schematic representation of the pB20 plasmid.

Fig. 3 is a schematic representation of the pP6 plasmid.

Fig. 4 is a schematic representation of vectors expressing the T25 fragment.

Fig. 5 is a schematic representation of vectors expressing the T18 fragment.

Fig. 6 is a schematic representation of various vectors of pCmAHLI, pT25 and pT18.

Fig. 7 is a schematic representation identifying the SID®s of proteins of the present
invention. In this figure the "Full-length prey protein" is the Open Reading Frame (ORF) or
coding sequence (CDS) where the identified prey polypeptides are included. The Selected
Interaction Domain (SID®) is determined by the commonly shared polypeptide domain of
every selected prey fragment.
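
By way of illustration only, the notion of a domain "commonly shared" by every selected prey fragment can be sketched as an interval intersection. The sketch below assumes each prey fragment is described by a start and end residue position on the full-length prey protein; the function name `shared_domain` and the fragment coordinates are hypothetical and are not taken from the present description.

```python
from typing import Iterable, Optional, Tuple

# (start, end) residue positions on the full-length prey protein,
# 1-based and inclusive. This representation is an assumption.
Interval = Tuple[int, int]


def shared_domain(fragments: Iterable[Interval]) -> Optional[Interval]:
    """Return the region common to every fragment, or None if there is none."""
    fragments = list(fragments)
    if not fragments:
        return None
    # The shared region starts at the latest start and ends at the earliest end.
    start = max(s for s, _ in fragments)
    end = min(e for _, e in fragments)
    return (start, end) if start <= end else None


# Hypothetical prey fragments selected against one bait.
prey_fragments = [(120, 310), (150, 400), (95, 290)]
print(shared_domain(prey_fragments))  # (150, 290)
```

Under these assumptions, fragments that do not all overlap yield no shared domain, which is why the sketch returns `None` rather than an empty interval.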

Fig. 8 is a protein map (PIM®). 

Fig. 9 is a schematic representation of the pB27 plasmid. 

Fig. 10 is a schematic representation of the pB28 plasmid. 

Fig. 11 is a schematic representation of a protein interaction map around the newly
functionally characterized proteins described in the present invention. These 10 proteins are
highlighted by a symbol. The Predicted Biological Score (PBS) is represented by a
code on each line and classified from A to E (Rain et al., 2001). PP1ca is also named
PPP1CA. MADH5 and MADH6 correspond to Smad5 and Smad9, respectively. hMAD-2 and
hMAD-3 correspond to Smad2 and Smad3, respectively. MAN1 is the ortholog of SANE,
a protein recently identified as involved in the BMP pathway (Raju et al., 2002).

Fig. 12 is a schematic representation of a protein interaction map between ZNF8 and
Smad proteins. The full-length proteins are represented in grey and black boxes correspond
to the interaction domains. Using two-hybrid screening, ZNF8 was shown to interact with
Smad1 (A), Smad4 (B), Smad5 (C) and Smad9 (D). Amino-acid positions are indicated.

Fig. 13 A, B and C are graphs showing that ZNF8 siRNA represses TGFβ- and
BMP-dependent luciferase reporter activities. HepG2 cells were transiently transfected in
24-well plates as described under Materials & Methods with the BMP-responsive luciferase
reporter, p(GC)12-MLP-Luc (A & B), or the TGFβ-responsive luciferase reporter,
p(GTCT)8-MLP-Luc (C). All experiments included pRL-TK as an internal transfection control.
A TβRI-targeting siRNA duplex was used as a positive control for disruption of the TGFβ
pathway. A mutated version of the TβRI-targeting siRNA duplex (2 mismatches versus
consensus sequence) was used as a negative control. siRNA transfections were performed at
4 and 40 nM. Co-transfection of ZNF8-targeting siRNA duplex was tested in cells treated or
not with 50 ng/ml BMP7 (A), 50 ng/ml BMP6 (B) or 5 ng/ml TGFβ1 (C) for 18 hours in cells
pre-starved for 2 hours in serum-free culture medium. Cells were harvested 48 hours after
transfection and 10 µl of lysates were used for the Dual Luciferase Assay. Data are
representative of two or three independent experiments performed in duplicate and are
presented as a ratio between firefly and renilla luciferases.
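
As an illustration of how a ratio between firefly and renilla luciferase readings may be computed, the following minimal sketch divides each firefly reading by the renilla reading from the same well (renilla being the pRL-TK internal control) and averages the replicate wells. The function names and the numerical readings are hypothetical, and averaging per-well ratios is an assumption rather than a procedure stated in the present description.

```python
from statistics import mean
from typing import Sequence, Tuple

# One well = (firefly reading, renilla reading), in arbitrary light units.
Well = Tuple[float, float]


def luciferase_ratio(firefly: float, renilla: float) -> float:
    """Normalize a firefly reading by the renilla internal control."""
    if renilla <= 0:
        raise ValueError("renilla reading must be positive")
    return firefly / renilla


def mean_ratio(wells: Sequence[Well]) -> float:
    """Average the firefly/renilla ratio over replicate wells."""
    return mean(luciferase_ratio(f, r) for f, r in wells)


def fold_change(treated: Sequence[Well], control: Sequence[Well]) -> float:
    """Ratio of the normalized reporter activity, treated versus control."""
    return mean_ratio(treated) / mean_ratio(control)


# Hypothetical duplicate wells, without and with ligand treatment.
untreated = [(1000.0, 500.0), (1200.0, 600.0)]
ligand_treated = [(4000.0, 500.0), (3600.0, 600.0)]
print(mean_ratio(untreated))                   # 2.0
print(fold_change(ligand_treated, untreated))  # 3.5
```

Dividing by the renilla signal corrects for well-to-well differences in transfection efficiency, so that a change in the ratio reflects the reporter rather than the amount of DNA taken up.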

Fig. 14 A, B and C are graphs showing that ZNF8 siRNA specifically represses
BMP-dependent markers. HepG2 cells were transiently transfected in 24-well plates as
described under Materials & Methods with a control siRNA (pGL3-targeting siRNA) or
ZNF8-targeting siRNA duplex. Cells were treated or not with 50 ng/ml of recombinant human
BMP7 for 18 hours in cells pre-starved for 2 hours in serum-free culture medium. siRNA
transfections were performed either at 0.5 nM and 2.5 nM (A & B) or at 4 and 40 nM (C) of
duplex. Cells were harvested and lysed 48 hours after transfection. Total RNA was extracted
as described under Materials & Methods and quantitative PCR analyses were performed in
order to quantitate the endogenous levels of the BMP pathway markers junB (A) and alkaline
phosphatase (B & C). Data are representative of two or three independent experiments
performed in duplicate and are presented as normalized RNA levels using either GAPDH
(A & B) or hGUS (C).
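
One common way to express a quantitative PCR result as an RNA level normalized to a reference gene such as GAPDH or hGUS is the 2^-ΔCt calculation, sketched below. This is an assumption made for illustration: the present description does not state which normalization formula was applied, the calculation assumes a doubling of product at each cycle, and the function name and Ct values are hypothetical.

```python
def normalized_level(ct_target: float, ct_reference: float) -> float:
    """Relative RNA level of a target gene versus a reference gene.

    Uses 2 ** -(Ct_target - Ct_reference), which assumes that the amount
    of PCR product doubles at every cycle for both genes.
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** -delta_ct


# Hypothetical Ct values for a marker gene and a reference gene.
untreated = normalized_level(ct_target=24.0, ct_reference=20.0)
treated = normalized_level(ct_target=22.0, ct_reference=20.0)
print(untreated)            # 0.0625
print(treated)              # 0.25
print(treated / untreated)  # 4.0
```

A lower Ct means more starting template, so a marker whose Ct drops by two cycles relative to the reference gene is read as a four-fold increase under these assumptions.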

Fig. 15 A and B are graphs showing that ZNF8 siRNA does not repress BMP-independent
markers. HepG2 cells were transiently transfected in 24-well plates as described under
Materials & Methods with a control siRNA (pGL3-targeting siRNA) or ZNF8-targeting siRNA
duplex. Cells were treated or not with 50 ng/ml of recombinant human BMP7 for 18 hours in
cells pre-starved for 2 hours in serum-free culture medium. siRNA transfections were
performed either at 0.5 nM and 2.5 nM (A) or at 4 and 40 nM (B) of duplex. Cells were
harvested and lysed 48 hours after transfection. Total RNA was extracted as described under
Materials & Methods and quantitative PCR analyses were performed in order to quantitate
the endogenous levels of the TGFβ pathway marker PAI-1 (PAI-1 hereinafter Plasminogen
Activator Inhibitor 1) (A) and an unrelated marker, hGUS (B). Data are representative of two
or three independent experiments performed in duplicate and are presented as normalized
RNA levels using either GAPDH (A) or relative levels (B).

Fig. 16 is a schematic representation of an interaction between LAPTm5 and Smurf2.
The full-length proteins are represented in grey and black boxes correspond to the
interaction domains. Using two-hybrid screening, interaction between Smurf2 and LAPTm5
was found in both directions. Smurf2 was shown to interact with the C-terminal domain of
LAPTm5.

Fig. 17 A and B are graphs showing that LAPTm5 specifically inhibits the TGFβ
pathway. The effect of LAPTm5 over-expression was studied using the following luciferase
reporter vectors: a TGFβ-responsive element (TGF-RE = p(GTCT)8-MLP-Luc), a
BMP-responsive element (BMP-RE = p(GC)12-MLP-Luc) and an unrelated reporter (pGL3
control) (see Materials & Methods). The effect was studied in the presence or absence of
TGFβ (10 ng/ml) or BMP7 (50 ng/ml), as described. This study was performed with 0, 2 or
10 ng of pV3-LAPTm5 in HepG2 cells (A) or with 0, 0.5, 2, 10 or 50 ng of pV3-LAPTm5 in
HEK293 cells (B). The specific luciferase activity was normalized using the pRL-TK vector.
Experiments were performed in triplicate.

WO 03/045990    PCT/EP02/13866

Fig. 18 A and B are graphs showing that LAPTm5 expression is up-regulated by TGFβ.
The endogenous level of LAPTm5 mRNA was determined in several cell lines by Q-PCR
experiments using the LAPTm5 probe (see Materials & Methods). Ct levels of LAPTm5
mRNA are given for each cell line (A). The endogenous level of mRNA was determined in
HepG2 cells in the presence or absence of TGFβ (10 ng/ml) with or without a TβRI-targeting
siRNA duplex (B) (TβRI hereinafter Transforming Growth Factor β Receptor I).

Fig. 19 A and B are graphs showing that LAPTm5 siRNA up-regulates BMP- and
TGFβ-dependent reporter activities. HepG2 cells were transiently transfected in 24-well
plates as described under Materials & Methods with the TGFβ-responsive luciferase reporter,
p(GTCT)8-MLP-Luc (A), or the BMP-responsive luciferase reporter, p(GC)12-MLP-Luc (B). All
experiments included pRL-TK as an internal transfection control. A TβRI-targeting siRNA
duplex was used as a positive control for disruption of the TGFβ pathway. A mutated version
of the TβRI-targeting siRNA duplex (2 mismatches versus consensus sequence) was used
as a negative control. siRNA transfections were performed at 4 and 40 nM. Co-transfection of
LAPTm5-targeting siRNA duplex was tested in cells treated or not with 5 ng/ml recombinant
human TGFβ (A) or 50 ng/ml recombinant human BMP7 (B) for 18 hours in cells pre-starved
for 2 hours in serum-free culture medium. Cells were harvested 48 hours after transfection
and 10 µl of lysates were used for the Dual Luciferase Assay. Data are representative of two
or three independent experiments performed in duplicate and are presented as a ratio
between firefly and renilla luciferases.

Fig. 20 A, B, C and D are graphs showing that LAPTm5 siRNA up-regulates BMP- and
TGFβ-dependent markers. HepG2 cells were transiently transfected in 24-well plates as
described under Materials & Methods with a control siRNA (pGL3-targeting siRNA) or
LAPTm5-targeting siRNA duplex. Cells were treated or not with 5 ng/ml of recombinant
human TGFβ1 or 50 ng/ml of recombinant human BMP7 for 18 hours in cells pre-starved for
2 hours in serum-free culture medium. siRNA transfections were performed at 40 nM of
duplex (A, B, C & D). Cells were harvested and lysed 48 hours after transfection. Total RNA
was extracted as described under Materials & Methods and quantitative PCR analyses were
performed in order to quantitate the endogenous levels of the TGFβ pathway markers PAI-1
and junB (A & B, respectively) and a BMP pathway marker, alkaline phosphatase (C). Data
are representative of two or three independent experiments performed in duplicate and are
presented as normalized RNA levels using hGUS (A, B & C). Relative levels of hGUS in the
same experiment are also shown (D).

Fig. 21 is a schematic representation of an interaction between RNF11 and Smurf1, Smurf2
and SARA. The full-length proteins are represented in grey and black boxes correspond to
the interaction domains. Using two-hybrid screening, RNF11 was shown to interact with
Smurf1 (A), Smurf2 (B), and SARA (C). Amino-acid positions are indicated.

Fig. 22 is a gel showing that RNF11 is involved in regulating SARA protein levels.
Transfection experiments with pV3-SARA (200 ng) and/or pV3-RNF11 (300 ng) in the
presence or absence of TGFβ (10 ng/ml) were performed. After TGFβ induction for 18 hours,
cell lysates were resolved on a 4-12% NuPAGE gradient gel, transferred and revealed using
anti-SARA antibody (see Materials & Methods).

Fig. 23 is a schematic diagram showing the interaction between KIAA1196 and Smad1.
The full-length proteins are represented in grey and black boxes correspond to the
interaction domains. Using two-hybrid screening, KIAA1196 was shown to interact with
Smad1.

Fig. 24 A and B are graphs showing that KIAA1196 siRNA specifically represses
TGFβ-dependent markers in HepG2 cells. HepG2 cells were transiently transfected in
24-well plates as described under Materials & Methods with the TGFβ-responsive luciferase
reporter, p(GTCT)8-MLP-Luc (A), or the BMP-responsive luciferase reporter, p(GC)12-MLP-Luc
(B). All experiments included pRL-TK as an internal transfection control. A TβRI-targeting
siRNA duplex was used as a positive control for disruption of the TGFβ pathway. A mutated
version of the TβRI-targeting siRNA duplex (2 mismatches versus consensus sequence)
was used as a negative control. siRNA transfections were performed at 4 and 40 nM.
Co-transfection of KIAA1196-targeting siRNA duplex was tested in cells treated or not with
5 ng/ml recombinant human TGFβ (A) and 50 ng/ml recombinant human BMP6 (B) for 18
hours in cells pre-starved for 2 hours in serum-free culture medium. Cells were harvested 48
hours after transfection and 10 µl of lysates were used for the Dual Luciferase Assay. Data
are representative of two or three independent experiments performed in duplicate and are
presented as a ratio between firefly and renilla luciferases.

Fig. 25 is a graph showing that KIAA1196 siRNA specifically represses TGFβ-dependent
reporter activity in HEK293 cells. HEK293 cells were transiently transfected in 24-well
plates as described under Materials & Methods with the TGFβ-responsive luciferase
reporter, p(GTCT)8-MLP-Luc. All experiments included pRL-TK as an internal transfection
control. A TβRI-targeting siRNA duplex was used as a positive control for disruption of the
TGFβ pathway. A mutated version of the TβRI-targeting siRNA duplex (2 mismatches versus
consensus sequence) was used as a negative control. siRNA transfections were performed
at 30 nM. Co-transfection of KIAA1196-targeting siRNA duplex was tested in cells treated or
not with 5 ng/ml recombinant human TGFβ for 18 hours in cells pre-starved for 2 hours in
serum-free culture medium. Cells were harvested 48 hours after transfection and 10 µl of
lysates were used for the Dual Luciferase Assay. Data are representative of two or three
independent experiments performed in duplicate and are presented as a ratio between firefly
and renilla luciferases.

Fig. 26 A, B, C and D are graphs showing that KIAA1196 siRNA specifically represses
TGFβ-dependent markers. HepG2 cells were transiently transfected in 24-well plates as
described under Materials & Methods with a control siRNA (pGL3-targeting siRNA) or
KIAA1196-targeting siRNA duplex. Cells were treated or not with 5 ng/ml of recombinant
human TGFβ1 or 50 ng/ml of recombinant human BMP7 for 18 hours in cells pre-starved for
2 hours in serum-free culture medium. siRNA transfections were performed at 40 nM of
duplex (A, B, C & D). Cells were harvested and lysed 48 hours after transfection. Total RNA
was extracted as described under Materials & Methods and quantitative PCR analyses were
performed in order to quantitate the endogenous levels of the TGFβ pathway markers PAI-1
and junB (A & B, respectively) and a BMP pathway marker, alkaline phosphatase (C). Data
are representative of two or three independent experiments performed in duplicate and are
presented as normalized RNA levels using hGUS (A, B & C). Relative levels of hGUS in the
same experiment are also shown (D).

Fig. 27 is a schematic representation showing the interaction between LMO4 and
Smad9. The full-length proteins are represented in grey and black boxes correspond to the
interaction domains. Using two-hybrid screening, LMO4 was shown to interact with Smad9.

Fig. 28 A, B and C are graphs showing that LMO4 siRNA specifically represses a
BMP-dependent luciferase reporter. HepG2 cells were transiently transfected in 24-well
plates as described under Materials & Methods with the BMP-responsive luciferase reporter,
p(GC)12-MLP-Luc (A), or the TGFβ-responsive luciferase reporter, p(GTCT)8-MLP-Luc (B). All
experiments included pRL-TK as an internal transfection control. A TβRI-targeting siRNA
duplex was used as a positive control for disruption of the TGFβ pathway. A mutated version
of the TβRI-targeting siRNA duplex (2 mismatches versus consensus sequence) was used
as a negative control. siRNA transfections were performed at 4 and 40 nM. Co-transfection of
LMO4-targeting siRNA duplex was tested in cells treated or not with 50 ng/ml recombinant
human BMP7 or BMP6 (A & B, respectively) and 5 ng/ml recombinant human TGFβ (C) for
18 hours in cells pre-starved for 2 hours in serum-free culture medium. Cells were harvested
48 hours after transfection and 10 µl of lysates were used for the Dual Luciferase Assay. Data
are representative of two or three independent experiments performed in duplicate and are
presented as a ratio between firefly and renilla luciferases.

Fig. 29 A and B are graphs showing that LMO4 siRNA specifically represses
BMP-induced markers in BMP7-treated HepG2 cells. HepG2 cells were transiently transfected
in 24-well plates as described under Materials & Methods with a control siRNA
(pGL3-targeting siRNA) or LMO4-targeting siRNA duplex. Cells were treated or not with
50 ng/ml of recombinant human BMP7 for 18 hours in cells pre-starved for 2 hours in
serum-free culture medium. siRNA transfections were performed at 0.5 or 2.5 nM of duplex
(A) and 4 or 40 nM of duplex (B). Cells were harvested and lysed 48 hours after transfection.
Total RNA was extracted as described under Materials & Methods and quantitative PCR
analyses were performed in order to quantitate the endogenous levels of the BMP pathway
marker alkaline phosphatase (A & B). Data are representative of two or three independent
experiments performed in duplicate and are presented as normalized RNA levels using hGUS
(A, B).

Fig. 30 A, B and C are graphs showing that LMO4 siRNA does not repress
BMP-independent markers in BMP7-treated HepG2 cells. HepG2 cells were transiently
transfected in 24-well plates as described under Materials & Methods with a control siRNA
(pGL3-targeting siRNA) or LMO4-targeting siRNA duplex. Cells were treated or not with
50 ng/ml of recombinant human BMP7 for 18 hours in cells pre-starved for 2 hours in
serum-free culture medium. siRNA transfections were performed at 4 or 40 nM of duplex
(A, B) and 0.5 or 2.5 nM of duplex (C). Cells were harvested and lysed 48 hours after
transfection. Total RNA was extracted as described under Materials & Methods and
quantitative PCR analyses were performed in order to quantitate the endogenous levels of the
TGFβ and BMP pathway marker junB (A) and a TGFβ pathway marker, PAI-1 (C). Data are
representative of two or three independent experiments performed in duplicate and are
presented as normalized RNA levels using hGUS (A) or using GAPDH (C). Relative levels of
hGUS in the same experiment are also shown (B).

Fig. 31 is a schematic diagram showing the interaction between PP1ca and SARA.
The full-length proteins are represented in grey and black boxes correspond to the
interaction domains. Using two-hybrid screening, PP1ca was shown to interact with SARA.

Fig. 32 A and B are graphs showing that PP1ca stimulates the TGFβ pathway.
The effect of PP1ca over-expression was studied using the following luciferase reporter
vectors: a TGFβ-responsive element (TGF-RE = p(GTCT)8-MLP-Luc), a BMP-responsive
element (BMP-RE = p(GC)12-MLP-Luc) and an unrelated reporter (pGL3 control) (see
Materials & Methods). The effect was studied in the presence or absence of TGFβ (10 ng/ml)
or BMP7 (50 ng/ml), as described. This study was performed with 0, 10, 50 or 200 ng of
pV3-PP1ca in HepG2 cells (A) or in HEK293 cells (B). The specific luciferase activity was
normalized using the pRL-TK vector. Experiments were performed in triplicate.

Fig. 33 A, B and C are graphs showing that PP1ca stimulates PAI-1 mRNA expression.
Baculoviruses containing the Smad3 or PP1ca genes under the control of the CMV promoter
were generated and used to infect HepG2 cells (see Materials & Methods). The over-expression
level was checked and quantified by Q-PCR (A). The endogenous PAI-1 mRNA
levels were measured by Q-PCR 24 hours post infection with Smad3 or PP1ca-containing
baculoviruses in the presence or absence of TGFβ (10 ng/ml). The value 1 is attributed to the
mRNA amount of PAI-1 in the absence of TGFβ and in the absence of infection (B).

Fig. 34 is a schematic diagram showing the interaction between HYPA and Smad4.
The full-length proteins are represented in grey and black boxes correspond to the
interaction domains. Using two-hybrid screening, HYPA was shown to interact with Smad4.

Fig. 35 A, B and C are graphs showing that HYPA siRNA specifically represses BMP-
dependent reporter activity. HepG2 cells were transiently transfected in 24 well-plates as
described under Materials & Methods with the BMP-responsive luciferase reporter, p(GC)12-
MLP-Luc (A & B) or the TGFβ-responsive luciferase reporter, p(GTCT)8-MLP-Luc (C). All
experiments included pRL-TK as an internal transfection control. A TβRI-targeting siRNA
duplex was used as a positive control for disruption of the TGFβ pathway. A mutated version
of the TβRI-targeting siRNA duplex (2 mismatches versus consensus sequence) was used
as a negative control. siRNA transfections were performed at 4 and 40 nM. Co-transfection of
HYPA-targeting siRNA duplex was tested in cells treated or not with 50 ng/ml recombinant
human BMP7 or BMP6 (A & B, respectively) and 5 ng/ml recombinant human TGFβ (C) for
18 hours after being pre-starved for 2 hours in serum-free culture medium. Cells were harvested
48 hours after transfection and 10 µl of lysates were used for the Dual Luciferase Assay. Data
are representative of two or three independent duplicated experiments and are presented as
a ratio between firefly and renilla luciferases.

Fig. 36 is a graph showing that HYPA siRNA represses BMP-dependent markers.
HepG2 cells were transiently transfected in 24 well-plates as described under Materials &
Methods with a control siRNA (pGL3-targeting siRNA) or HYPA-targeting siRNA duplex.
Cells were treated or not with 50 ng/ml of recombinant human BMP7 for 18 hours after being
pre-starved for 2 hours in serum-free culture medium. siRNA transfections were performed at 0.5
or 2.5 nM of duplex. Cells were harvested and lysed 48 hours after transfection. Total RNA
was extracted as described under Materials & Methods and quantitative PCR analysis was
performed in order to quantitate the endogenous levels of the BMP pathway marker alkaline
phosphatase. Data are representative of two or three independent duplicated experiments
and are presented as normalized RNA levels using GAPDH.

Fig. 37 is a schematic diagram showing the interaction between FLJ20037 and SARA.
The full-length proteins are represented in grey and black boxes correspond to the
interaction domains. Using two-hybrid screening, FLJ20037 was shown to interact with
SARA.

Fig. 38 A, B and C are graphs showing that FLJ20037 stimulates PAI-1 mRNA
expression. Baculoviruses containing the Smad3 or FLJ20037 genes under the control of the
CMV promoter were generated and used to infect HepG2 cells (see Materials & Methods).
The over-expression level was checked and quantified by Q-PCR (A). The endogenous PAI-1
mRNA levels were measured by Q-PCR 24 hours post infection with Smad3 or FLJ20037-
containing baculoviruses in the presence or absence of TGFβ (10 ng/ml). The value 1 is
attributed to the mRNA amount of PAI-1 in the absence of TGFβ and in the absence of
infection (B).

Fig. 39 is a graph showing that FLJ20037 siRNA down-regulates TGFβ-dependent
markers. HepG2 cells were transiently transfected in 24 well-plates as described under
Materials & Methods with a control siRNA (pGL3-targeting siRNA) or FLJ20037-targeting
siRNA duplex. Cells were treated or not with 5 ng/ml of recombinant human TGFβ for 18
hours after being pre-starved for 2 hours in serum-free culture medium. siRNA transfections
were performed at 0.5 or 2.5 nM of duplex. Cells were harvested and lysed 48 hours after
transfection. Total RNA was extracted as described under Materials & Methods and
quantitative PCR analysis was performed in order to quantitate the endogenous levels of the
TGFβ pathway marker PAI-1. Data are representative of two or three independent duplicated
experiments and are presented as normalized RNA levels using GAPDH.

Fig. 40 is a schematic diagram showing the interaction between PTPN12 and Smad5.
The full-length proteins are represented in grey and black boxes correspond to the
interaction domains. Using two-hybrid screening, PTPN12 was shown to interact with
Smad5. Amino-acid positions are indicated.

Fig. 41 A and B are graphs showing that PTPN12 siRNA up-regulates BMP- and TGFβ-
dependent reporter activities. HepG2 cells were transiently transfected in 24 well-plates as
described under Materials & Methods with the BMP-responsive luciferase reporter, p(GC)12-
MLP-Luc (A) or the TGFβ-responsive luciferase reporter, p(GTCT)8-MLP-Luc (B). All
experiments included pRL-TK as an internal transfection control. A TβRI-targeting siRNA
duplex was used as a positive control for disruption of the TGFβ pathway. A mutated version
of the TβRI-targeting siRNA duplex (2 mismatches versus consensus sequence) was used
as a negative control. siRNA transfections were performed at 4 and 40 nM. Co-transfection of
PTPN12-targeting siRNA duplex was tested in cells treated or not with 50 ng/ml recombinant
human BMP6 (A) and 5 ng/ml recombinant human TGFβ (B) for 18 hours after being pre-starved
for 2 hours in serum-free culture medium. Cells were harvested 48 hours after transfection
and 10 µl of lysates were used for the Dual Luciferase Assay. Data are representative of two
or three independent duplicated experiments and are presented as a ratio between firefly
and renilla luciferases.

Fig. 42 A and B are schematic diagrams showing the interaction between HIPK3, SnoN
and SNIP1. The full-length proteins are represented in grey and black boxes correspond to
the interaction domains. Using two-hybrid screening, HIPK3 was shown to interact with the
N-terminal domains of SNIP1 (A) and SnoN (B). Amino-acid positions are indicated.

Fig. 43 A and B are graphs showing that HIPK3 siRNA specifically up-regulates BMP-
dependent reporter activities. HepG2 cells were transiently transfected in 24 well-plates as
described under Materials & Methods with the BMP-responsive luciferase reporter,
p(GC)12-MLP-Luc (A) or the TGFβ-responsive luciferase reporter, p(GTCT)8-MLP-Luc (B).
All experiments included pRL-TK as an internal transfection control. A TβRI-targeting siRNA
duplex was used as a positive control for disruption of the TGFβ pathway. A mutated version
of the TβRI-targeting siRNA duplex (2 mismatches versus consensus sequence) was used as
a negative control. siRNA transfections were performed at 4 and 40 nM. Co-transfection of
HIPK3-targeting siRNA duplex was tested in cells treated or not with 50 ng/ml recombinant
human BMP6 (A) and 5 ng/ml recombinant human TGFβ (B) for 18 hours after being
pre-starved for 2 hours in serum-free culture medium. Cells were harvested 48 hours after
transfection and 10 µl of lysates were used for the Dual Luciferase Assay. Data are
representative of two or three independent duplicated experiments and are presented as a
ratio between firefly and renilla luciferases.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS 

As used herein the terms "polynucleotides", "nucleic acids" and "oligonucleotides" are
used interchangeably and include, but are not limited to, RNA, DNA and RNA/DNA sequences of
more than one nucleotide in either single chain or duplex form. The polynucleotide
sequences of the present invention may be prepared by any known method including, but
not limited to, any synthetic method, any recombinant method, any ex vivo generation
method and the like, as well as combinations thereof.

The term "polypeptide" means herein a polymer of amino acids having no specific
length. Thus, peptides, oligopeptides and proteins are included in the definition of
"polypeptide" and these terms are used interchangeably throughout the specification, as well
as in the claims. The term "polypeptide" does not exclude post-translational modifications
such as polypeptides having covalent attachment of glycosyl groups, acetyl groups,
phosphate groups, lipid groups and the like. Also encompassed by this definition of
"polypeptide" are homologs thereof.

By the term "homologs" is meant structurally similar genes contained within a given
species; orthologs are functionally equivalent genes from a given species or strain, as
determined, for example, in a standard complementation assay. Thus, a polypeptide of
interest can be used not only as a model for identifying similar genes in given strains, but
also to identify homologs and orthologs of the polypeptide of interest in other species. The
orthologs, for example, can also be identified in a conventional complementation assay. In
addition or alternatively, such orthologs can be expected to exist in bacteria (or other kinds of
cells) in the same branch of the phylogenic tree, as set forth, for example, at
ftp://ftp.cme.msu.edu/pub/rdp/SSU-rRNA/SSU/Prok.phylo.

As used herein the term "prey polynucleotide" means a chimeric polynucleotide
encoding a polypeptide comprising (i) a specific domain; and (ii) a polypeptide that is to be
tested for interaction with a bait polypeptide. The specific domain is preferably a
transcriptional activating domain.

As used herein, a "bait polynucleotide" is a chimeric polynucleotide encoding a
chimeric polypeptide comprising (i) a complementary domain; and (ii) a polypeptide that is to
be tested for interaction with at least one prey polypeptide. The complementary domain is
preferably a DNA-binding domain that recognizes a binding site that is further detected and is
contained in the host organism.

As used herein, by "complementary domain" is meant a domain that functionally
reconstitutes an activity, for example an enzymatic activity, when bait and prey are interacting.

As used herein, by "specific domain" is meant a functional interacting activation domain
that may work through different mechanisms by interacting directly, or indirectly through
intermediary proteins, with RNA polymerase II or III-associated proteins in the vicinity of the
transcription start site.

As used herein the term "complementary" means that, for example, each base of a first
polynucleotide is paired with the complementary base of a second polynucleotide whose
orientation is reversed. The complementary bases are A and T (or A and U) or C and G.
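
The definition of "complementary" can be illustrated with a minimal Python sketch. The function names below are hypothetical and are not part of the specification; the sketch only encodes the base pairs named in the definition (A and T, A and U, C and G) and reads the second strand in reversed orientation.

```python
# Illustrative sketch only: names are hypothetical, not taken from the specification.
# Base pairs allowed by the definition: A-T (or A-U) and C-G.
ALLOWED_PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def reverse_complement(seq: str, rna: bool = False) -> str:
    """Return the complementary strand, written in reversed orientation."""
    comp = {"A": "U" if rna else "T", "T": "A", "U": "A", "C": "G", "G": "C"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def is_complementary(first: str, second: str) -> bool:
    """True if each base of `first` pairs with the matching base of `second`
    when `second` is read in the opposite orientation."""
    if len(first) != len(second):
        return False
    return all(
        (a, b) in ALLOWED_PAIRS
        for a, b in zip(first.upper(), reversed(second.upper()))
    )
```

For example, under these assumptions `is_complementary("ATGC", "GCAT")` holds because the second strand, read backwards, pairs base by base with the first.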

The term "sequence identity" refers to the identity between two peptides or between
two nucleic acids. Identity between sequences can be determined by comparing a position
in each of the sequences which may be aligned for the purposes of comparison. When a
position in the compared sequences is occupied by the same base or amino acid, then the
sequences are identical at that position. A degree of sequence identity between nucleic acid
sequences is a function of the number of identical nucleotides at positions shared by these
sequences. A degree of identity between amino acid sequences is a function of the number
of identical amino acids at positions shared by these sequences. Since two
polynucleotides may each (i) comprise a sequence (i.e., a portion of a complete polynucleotide
sequence) that is similar between the two polynucleotides, and (ii) further comprise a
sequence that is divergent between the two polynucleotides, sequence identity comparisons
between two or more polynucleotides are performed over a "comparison window". A
"comparison window" refers to a conceptual segment of at least 20 contiguous nucleotide
positions wherein a polynucleotide sequence may be compared to a reference nucleotide
sequence of at least 20 contiguous nucleotides and wherein the portion of the polynucleotide
sequence in the comparison window may comprise additions or deletions (i.e., gaps) of
20 percent or less compared to the reference sequence (which does not comprise additions
or deletions) for optimal alignment of the two sequences.
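
As a hedged illustration of the comparison-window constraints stated above (at least 20 contiguous positions in the reference, gaps of 20 percent or less), the following Python sketch checks whether a candidate window qualifies. The function name, the gap character and the choice to measure gaps against the reference length are assumptions made for illustration only.

```python
# Illustrative sketch only: names and the gap convention ("-") are assumptions.
MIN_WINDOW = 20        # at least 20 contiguous nucleotide positions
MAX_GAP_PERCENT = 20   # additions or deletions of 20 percent or less


def window_is_valid(aligned_query: str, reference_length: int, gap: str = "-") -> bool:
    """Check a comparison window against the two thresholds in the definition.

    `aligned_query` is the portion of the polynucleotide inside the window,
    with gaps marked by `gap`; `reference_length` is the length of the
    ungapped reference segment it is compared to.
    """
    if reference_length < MIN_WINDOW:
        return False
    gaps = aligned_query.count(gap)
    # Integer arithmetic avoids floating-point edge cases at exactly 20 percent.
    return gaps * 100 <= MAX_GAP_PERCENT * reference_length
```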

To determine the percent identity of two amino acid sequences or two nucleic acid
sequences, the sequences are aligned for optimal comparison. For example, gaps can be
introduced in the sequence of a first amino acid sequence or a first nucleic acid sequence for
optimal alignment with the second amino acid sequence or second nucleic acid sequence.
The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide
positions are then compared. When a position in the first sequence is occupied by the same
amino acid residue or nucleotide as the corresponding position in the second sequence, the
molecules are identical at that position.

The percent identity between the two sequences is a function of the number of identical
positions shared by the sequences. Hence % identity = (number of identical positions / total
number of overlapping positions) × 100.
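
The percent identity formula can be sketched in Python as follows. This is a minimal illustration, not the method of any particular alignment package: it assumes the two sequences have already been aligned to equal length, that gaps are written as "-", and that an "overlapping position" is a column in which neither sequence has a gap. The function name is hypothetical.

```python
# Illustrative sketch only: assumes pre-aligned, equal-length inputs with "-" as gap.
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """% identity = identical positions / overlapping positions x 100."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Overlapping positions: columns where both sequences carry a residue.
    overlapping = [
        (x, y) for x, y in zip(aligned_a, aligned_b) if x != gap and y != gap
    ]
    if not overlapping:
        return 0.0
    identical = sum(1 for x, y in overlapping if x == y)
    return 100.0 * identical / len(overlapping)
```

With the aligned pair `"ACGT-A"` and `"ACCTGA"`, five columns overlap and four are identical, so the function returns 80.0.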

In this comparison the sequences can be the same length or may be different in length.
Optimal alignment of sequences for determining a comparison window may be conducted by
the local homology algorithm of Smith and Waterman (J. Theor. Biol., 91 (2) pgs. 370-380
(1981)), by the homology alignment algorithm of Needleman and Wunsch (J. Mol. Biol., 48(3)
pgs. 443-453 (1972)), by the search for similarity via the method of Pearson and Lipman
(PNAS, USA, 85(5) pgs. 2444-2448 (1988)), by computerized implementations of these
algorithms (GAP, BESTFIT, FASTA and TFASTA in the Wisconsin Genetics Software
Package Release 7.0, Genetics Computer Group, 575 Science Drive, Madison, Wisconsin) or
by inspection.

The best alignment (i.e., resulting in the highest percentage of identity over the 
comparison window) generated by the various methods is selected. 

The term "sequence identity" means that two polynucleotide sequences are identical
(i.e., on a nucleotide by nucleotide basis) over the window of comparison. The term
"percentage of sequence identity" is calculated by comparing two optimally aligned
sequences over the window of comparison, determining the number of positions at which the
identical nucleic acid base (e.g., A, T, C, G, U, or I) occurs in both sequences to yield the
number of matched positions, dividing the number of matched positions by the total number
of positions in the window of comparison (i.e., the window size) and multiplying the result by
100 to yield the percentage of sequence identity. The same process can be applied to
polypeptide sequences.

The percentage of sequence identity of a nucleic acid sequence or an amino acid
sequence can also be calculated using BLAST software (Version 2.06 of September 1998)
with the default or user-defined parameters.

The term "sequence similarity" means that amino acids can be modified while retaining
the same function. It is known that amino acids are classified according to the nature of their
side groups and some amino acids such as the basic amino acids can be interchanged for
one another while their basic function is maintained.

The term "isolated" as used herein means that a biological material such as a nucleic
acid or protein has been removed from its original environment in which it is naturally
present. For example, a polynucleotide present in a plant, mammal or animal is present in its
natural state and is not considered to be isolated. The same polynucleotide separated from
the adjacent nucleic acid sequences in which it is naturally inserted in the genome of the
plant or animal is considered as being "isolated."

The term "isolated" is not meant to exclude artificial or synthetic mixtures with other
compounds, or the presence of impurities which do not interfere with the biological activity
and which may be present, for example, due to incomplete purification, addition of stabilizers
or mixtures with pharmaceutically acceptable excipients and the like.

"Isolated polypeptide" or "isolated protein" as used herein means a polypeptide or
protein which is substantially free of those compounds that are normally associated with the
polypeptide or protein in a natural state such as other proteins or polypeptides, nucleic
acids, carbohydrates, lipids and the like.

The term "purified" as used herein means at least one order of magnitude of
purification is achieved, preferably two or three orders of magnitude, most preferably four or
five orders of magnitude of purification of the starting material or of the natural material.
Thus, the term "purified" as utilized herein does not mean that the material is 100% purified
and thus does not exclude the presence of any other material.

The term "variants" when referring to, for example, polynucleotides encoding a
polypeptide variant of a given reference polypeptide means polynucleotides that differ from
the reference polynucleotide but whose encoded polypeptides generally maintain the
functional characteristics of the reference polypeptide. A variant of a polynucleotide may be
a naturally occurring allelic variant or it may be a variant that is not known to occur naturally.
Such non-naturally occurring variants of the reference polynucleotide can be made by, for
example, mutagenesis techniques, including those mutagenesis techniques that are applied
to polynucleotides, cells or organisms.

Generally, differences are limited so that the nucleotide sequences of the reference
and variant are closely similar overall and, in many regions, identical.

Variants of polynucleotides according to the present invention include, but are not
limited to, nucleotide sequences which are at least 95% identical after alignment to the
reference polynucleotide encoding the reference polypeptide. These variants can also have
96%, 97%, 98% and 99.999% sequence identity to the reference polynucleotide.

Nucleotide changes present in a variant polynucleotide may be silent, which means
that these changes do not alter the amino acid sequences encoded by the reference
polynucleotide.

Substitutions, additions and/or deletions can involve one or more nucleotides.
Alterations can produce conservative or non-conservative amino acid substitutions, deletions
and/or additions.

Variants of a prey or a SID® polypeptide encoded by a variant polynucleotide can
possess a higher affinity of binding and/or a higher specificity of binding to its protein or
polypeptide counterpart, against which it has been initially selected. In another context,
variants can also lose their ability to bind to their protein or polypeptide counterpart.

By "fragment of a polynucleotide" or "fragment of a SID® polynucleotide" is meant that
fragments of these sequences have at least 12 consecutive nucleotides, or between 12 and
5,000 consecutive nucleotides, or between 12 and 10,000 consecutive nucleotides, or
between 12 and 20,000 consecutive nucleotides.

By "fragment of a polypeptide" or "fragment of a SID® polypeptide" is meant that
fragments of these sequences have at least 4 consecutive amino acids, or between 4 and
1,700 consecutive amino acids, or between 4 and 3,300 consecutive amino acids, or
between 4 and 6,600 consecutive amino acids.

By "anabolic pathway" is meant a reaction or series of reactions in a metabolic pathway
that synthesize complex molecules from simpler ones, usually requiring the input of energy.
An anabolic pathway is the opposite of a catabolic pathway.

As used herein, a "catabolic pathway" is a series of reactions in a metabolic pathway
that break down complex compounds into simpler ones, usually releasing energy in the
process. A catabolic pathway is the opposite of an anabolic pathway.

As used herein, by "drug metabolism" is meant the study of how drugs are processed and
broken down by the body. Drug metabolism can involve the study of enzymes that break
down drugs, the study of how different drugs interact within the body and how diet and other
ingested compounds affect the way the body processes drugs.

As used herein, "metabolism" means the sum of all of the enzyme-catalyzed reactions
in living cells that transform organic molecules.

By "secondary metabolism" is meant pathways producing specialized metabolic 
products that are not found in every cell. 

As used herein, "SID®" means a Selected Interacting Domain and is identified as
follows: for each bait polypeptide screened, selected prey polypeptides are compared.
Overlapping fragments in the same ORF or CDS define the selected interacting domain.
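
One way to picture how overlapping prey fragments define a selected interacting domain is the Python sketch below. It rests on assumptions that are not stated in the specification: each prey fragment is represented by hypothetical inclusive (start, end) coordinates on the same ORF or CDS, and the domain is taken to be the region common to all selected fragments.

```python
# Illustrative sketch only: the coordinate representation and the "common region"
# rule are assumptions, not the procedure claimed in the specification.
from typing import List, Optional, Tuple

Fragment = Tuple[int, int]  # inclusive (start, end) positions on one ORF/CDS


def selected_interacting_domain(fragments: List[Fragment]) -> Optional[Fragment]:
    """Return the region shared by every fragment, or None if they do not overlap."""
    if not fragments:
        return None
    start = max(s for s, _ in fragments)  # latest fragment start
    end = min(e for _, e in fragments)    # earliest fragment end
    return (start, end) if start <= end else None
```

For three hypothetical fragments spanning positions 10-200, 50-250 and 30-180, the shared region under these assumptions is 50-180.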

As used herein the term "PIM®" means a protein-protein interaction map. This map is
obtained from data acquired from a number of separate screens using different bait
polypeptides and is designed to map out all of the interactions between the polypeptides.
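
A protein-protein interaction map assembled from several separate screens can be pictured as an undirected graph. The Python sketch below is only an illustration under that assumption; the function name and the dictionary-based input format are hypothetical, and the example pairs are the bait and prey names given in the figure descriptions (SARA with PP1ca and FLJ20037, Smad4 with HYPA).

```python
# Illustrative sketch only: the graph representation is an assumption.
from collections import defaultdict
from typing import Dict, Iterable, Set


def build_interaction_map(screens: Dict[str, Iterable[str]]) -> Dict[str, Set[str]]:
    """Merge per-bait screen results into one undirected adjacency map.

    `screens` maps each bait name to the prey names selected in its screen.
    """
    pim: Dict[str, Set[str]] = defaultdict(set)
    for bait, preys in screens.items():
        for prey in preys:
            pim[bait].add(prey)   # bait -> prey edge
            pim[prey].add(bait)   # prey -> bait edge (interaction is symmetric)
    return dict(pim)


if __name__ == "__main__":
    example = build_interaction_map(
        {"SARA": ["PP1ca", "FLJ20037"], "Smad4": ["HYPA"]}
    )
    for protein in sorted(example):
        print(protein, "interacts with", sorted(example[protein]))
```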

The term "affinity of binding", as used herein, can be defined as the affinity constant Ka
of a given SID® polypeptide of the present invention which binds to a polypeptide, and is
given by the following mathematical relationship:

Ka = [SID®/polypeptide complex] / ([free SID®] [free polypeptide])

wherein [free SID®], [free polypeptide] and [SID®/polypeptide complex] are the
concentrations at equilibrium, respectively, of the free SID® polypeptide, of the free
polypeptide onto which the SID® polypeptide binds and of the complex formed between the
SID® polypeptide and the polypeptide onto which said SID® polypeptide specifically binds.

The affinity of a SID® polypeptide of the present invention or a variant thereof for its
polypeptide counterpart can be assessed, for example, on a Biacore™ apparatus marketed
by Amersham Pharmacia Biotech Company such as described by Szabo et al. (Curr Opin
Struct Biol 5 pgs. 699-705 (1995)) and by Edwards and Leatherbarrow (Anal. Biochem 246
pgs. 1-6 (1997)).
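
The relationship for Ka can be written as a short Python function. This is a minimal sketch with a hypothetical function name; it assumes the three equilibrium concentrations are supplied in the same units (for example mol/L, which gives Ka in L/mol) and it does not model how those concentrations are measured.

```python
# Illustrative sketch only: name and unit convention are assumptions.
def affinity_constant(complex_conc: float, free_sid: float, free_polypeptide: float) -> float:
    """Ka = [SID/polypeptide complex] / ([free SID] * [free polypeptide]),
    with all concentrations taken at equilibrium."""
    if free_sid <= 0 or free_polypeptide <= 0:
        raise ValueError("free concentrations must be positive")
    if complex_conc < 0:
        raise ValueError("complex concentration cannot be negative")
    return complex_conc / (free_sid * free_polypeptide)
```

A larger value indicates tighter binding, since more of the material is found in the complex at equilibrium.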

As used herein the phrase "at least the same affinity" with respect to the binding affinity
of a SID® polypeptide of the present invention for another polypeptide means that the
Ka is identical to, or can be at least two-fold, at least three-fold or at least five-fold greater
than, the reference Ka value.

As used herein, the term "modulating compound" means a compound that inhibits or
stimulates or can act on another protein which can inhibit or stimulate the protein-protein
interaction of a complex of two polypeptides or the protein-protein interaction of two
polypeptides.

More specifically, the present invention comprises complexes of polypeptides or
polynucleotides encoding the polypeptides composed of a bait polypeptide, or a bait
polynucleotide encoding a bait polypeptide and a prey polypeptide or a prey polynucleotide
encoding a prey polypeptide. The prey polypeptide or prey polynucleotide encoding the prey
polypeptide is capable of interacting with a bait polypeptide of interest in various hybrid
systems.

As described in the background of the present invention, there are various methods
known in the art to identify prey polypeptides that interact with bait polypeptides of interest.
These methods include, but are not limited to, generic two-hybrid systems as described by
Fields et al. (Nature, 340:245-246 (1989)) and more specifically in U.S. Patent Nos.
5,283,173, 5,468,614 and 5,667,973, which are hereby incorporated by reference; the
reverse two-hybrid system described by Vidal et al. (supra); the two plus one hybrid method
described, for example, in Tirode et al. (supra); the yeast forward and reverse 'n'-hybrid
systems as described in Vidal and Legrain (supra); the method described in WO 99/42612;
those methods described in Legrain et al. (FEBS Letters 480 pgs. 32-36 (2000)) and the like.

The present invention is not limited to the type of method utilized to detect protein-
protein interactions and therefore any method known in the art and variants thereof can be
used. It is, however, preferable to use the method described in WO 99/42612 or WO 00/66722,
both of which are incorporated herein by reference, because of the methods' sensitivity,
reproducibility and reliability.

Protein-protein interactions can also be detected using complementation assays such
as those described by Pelletier et al. at http://www.abrf.org/JBT/Articles/JBT0012/jbt0012.html,
WO 00/07038 and WO 98/34120.

Although the above methods are described for applications in the yeast system, the
present invention is not limited to detecting protein-protein interactions using yeast, but also
includes similar methods that can be used in detecting protein-protein interactions in, for
example, mammalian systems as described, for example, in Takacs et al. (Proc. Natl. Acad.
Sci., USA, 90 (21):10375-79 (1993)) and Vasavada et al. (Proc. Natl. Acad. Sci., USA, 88
(23):10686-90 (1991)), as well as a bacterial two-hybrid system as described in Karimova et
al. (1998), WO 99/28746, WO 00/66722 and Legrain et al. (FEBS Letters, 480 pgs. 32-36
(2000)).

Although the above-described methods are limited to the use of yeast, mammalian cells and
Escherichia coli cells, the present invention is not limited in this manner. Consequently,
mammalian and typically human cells, as well as bacterial, yeast, fungus, insect, nematode
and plant cells are encompassed by the present invention and may be transfected by the
nucleic acid or recombinant vector as defined herein.

Examples of suitable cells include, but are not limited to, VERO cells, HELA cells such
as ATCC No. CCL2, CHO cell lines such as ATCC No. CCL61, COS cells such as COS-7
cells and ATCC No. CRL 1650 cells, WI38, BHK, HepG2, 3T3 such as ATCC No. CRL6361,
A549, PC12, K562 cells, 293 cells, Sf9 cells such as ATCC No. CRL1711 and CV1 cells such
as ATCC No. CCL70.

Other suitable cells that can be used in the present invention include, but are not
limited to, prokaryotic host cell strains such as Escherichia coli (e.g., strain DH5-α), Bacillus
subtilis, Salmonella typhimurium, or strains of the genera Pseudomonas, Streptomyces
and Staphylococcus.

Further suitable cells that can be used in the present invention include yeast cells such
as those of Saccharomyces such as Saccharomyces cerevisiae.

The bait polynucleotide, as well as the prey polynucleotide, can be prepared according
to the methods known in the art such as those described above in the publications and
patents reciting the known method per se.

The bait and the prey polynucleotides of the present invention are obtained from
transforming growth factor β cDNA, or variants of cDNA fragments from a library of
transforming growth factor β, and fragments from the genome or transcriptome of
transforming growth factor β cDNA ranging from about 12 to about 5,000, or about 12 to
about 10,000 or from about 12 to about 20,000. The prey polynucleotide is then selected,
sequenced and identified.

A transforming growth factor β super-family of cytokines prey library is prepared from
the transforming growth factor β cDNA and constructed in the specially designed prey vector
pP6 as shown in Figure 3 after ligation of suitable linkers such that every cDNA insert is
fused to a nucleotide sequence in the vector that encodes the transcription activation domain
of a reporter gene. Any transcription activation domain can be used in the present invention.
Examples include, but are not limited to, Gal4, VP16, B42, His and the like. Toxic reporter
genes, such as CAT*, CYH2, CYH1, URA3, bacterial and fungi toxins and the like can be
used in reverse two-hybrid systems.

The polypeptides encoded by the nucleotide inserts of the transforming growth factor β
prey library thus prepared are termed "prey polypeptides" in the context of the presently
described selection method of the prey polynucleotides.

The bait polynucleotides can be inserted in bait plasmid pB27 or pB28 as illustrated in
Figure 8 and Figure 9. The bait polynucleotide insert is fused to a polynucleotide encoding
a DNA-binding domain, for example the Gal4 DNA-binding domain, and the shuttle expression
vector is used to transform cells.

The bait polynucleotides used in the present invention are described in Table 1.

As stated above, any cells can be utilized in transforming the bait and prey 
polynucleotides of the present invention including mammalian cells, bacterial cells, yeast 
cells, insect cells and the like. 

In an embodiment, the present invention identifies protein-protein interactions in yeast.
Using known methods, a prey positive clone is identified containing a vector which
comprises a nucleic acid insert encoding a prey polypeptide which binds to a bait polypeptide
of interest. The method in which protein-protein interactions are identified comprises the
following steps:

i) mating at least one first haploid recombinant yeast cell clone from a recombinant
yeast cell clone library that has been transformed with a plasmid containing the prey
polynucleotide to be assayed with a second haploid recombinant yeast cell clone
transformed with a plasmid containing a bait polynucleotide encoding the bait polypeptide;

ii) cultivating diploid cell clones obtained in step i) on a selective medium; and

iii) selecting recombinant cell clones which grow on the selective medium.

This method may further comprise the step of:

iv) characterizing the prey polynucleotide contained in each recombinant cell clone
which is selected in step iii).

In yet another embodiment of the present invention, in lieu of yeast, Escherichia coli is 
used in a bacterial two-hybrid system, which encompasses a similar principle to that 
described above for yeast, but does not involve mating for characterizing the prey 
polynucleotide. 

In yet another embodiment of the present invention, mammalian cells and a method
similar to that described above for yeast for characterizing the prey polynucleotide are used.

By performing the yeast, bacterial or mammalian two-hybrid system, it is possible to
identify for one particular bait an interacting prey polypeptide. The prey polynucleotide that
has been selected by testing the library of preys in a screen using the two-hybrid, two plus
one hybrid methods and the like, encodes the polypeptide interacting with the protein of
interest.

The present invention is also directed, in a general aspect, to a complex of
polypeptides, or of polynucleotides encoding the polypeptides, composed of a bait
polypeptide or bait polynucleotide encoding the bait polypeptide and a prey polypeptide or
prey polynucleotide encoding the prey polypeptide capable of interacting with the bait
polypeptide of interest. These complexes are identified in Table 2.

In another aspect, the present invention relates to a complex of polynucleotides
consisting of a first polynucleotide, or a fragment thereof, encoding a prey polypeptide that
interacts with a bait polypeptide and a second polynucleotide or a fragment thereof. This
fragment has at least 12 consecutive nucleotides, but can have between 12 and 5,000
consecutive nucleotides, or between 12 and 10,000 consecutive nucleotides or between 12
and 20,000 consecutive nucleotides.

The complexes of the two interacting polypeptides listed in Table 2 and the sets of two 
polynucleotides encoding these polypeptides also form part of the present invention. 

In yet another embodiment, the present invention relates to an isolated complex of at
least two polypeptides encoded by two polynucleotides wherein said two polypeptides are
associated in the complex by affinity binding and are depicted in columns 1 and 4 of Table 2.

In yet another embodiment, the present invention relates to an isolated complex
comprising at least a polypeptide as described in column 1 of Table 2 and a polypeptide as
described in column 4 of Table 2. The present invention is not limited to these polypeptide
complexes alone but also includes the isolated complex of the two polypeptides in which
fragments and/or homologous polypeptides exhibit at least 95% sequence identity, as well as
from 96% sequence identity to 99.999% sequence identity.
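
As an illustration of the 95% threshold, the following Python sketch computes percent
identity between two sequences that have already been aligned. It is a minimal sketch under
stated assumptions, not the method used in the application: the function names are
hypothetical, gaps are written as "-", and identity is taken as identical positions divided
by all aligned columns that are not gap-only. Other conventions for the denominator exist
and give slightly different values.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment.

    Both inputs must have the same length; '-' marks a gap.
    Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # gap-only column carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 95.0) -> bool:
    """True when the pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical 20-residue example: one substitution, then two.
reference = "ACDEFGHIKLMNPQRSTVWY"
one_change = "ACDEFGHIKLMNPQRSTVWF"
two_changes = "GCDEFGHIKLMNPQRSTVWF"
```

With these example sequences, a single substitution in 20 positions sits exactly at the
95% boundary, while two substitutions fall below it.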

Also encompassed in another embodiment of the present invention is an isolated
complex in which the SID® of the prey polypeptides encoded by SEQ ID No 27 to 64 in Table
3 form the isolated complex.

Besides the isolated complexes described above, nucleic acids coding for a Selected
Interacting Domain (SID®) polypeptide or a variant thereof or any of the nucleic acids set
forth in Table 3 can be inserted into an expression vector which contains the necessary
elements for the transcription and translation of the inserted protein-coding sequence. Such
transcription elements include a regulatory region and a promoter. Thus, the nucleic acid
which may encode a marker compound of the present invention is operably linked to a
promoter in the expression vector. The expression vector may also include a replication
origin.

A wide variety of host/expression vector combinations can be employed in expressing the
nucleic acids of the present invention. Useful expression vectors include, for example,
segments of chromosomal, non-chromosomal and synthetic DNA sequences.
Suitable vectors include, but are not limited to, derivatives of SV40 and pcDNA and known
bacterial plasmids such as col E1, pCR1, pBR322, pMal-C2, pET, pGEX as described by
Smith et al. (1988), pMB9 and derivatives thereof; plasmids such as RP4; phage DNAs such
as the numerous derivatives of phage λ such as NM989, as well as other phage DNA such as
M13 and filamentous single-stranded phage DNA; yeast plasmids such as the 2 micron
plasmid or derivatives of the 2μ plasmid, as well as centromeric and integrative yeast shuttle
vectors; vectors useful in eukaryotic cells such as vectors useful in insect or mammalian
cells; vectors derived from combinations of plasmids and phage DNAs, such as plasmids that
have been modified to employ phage DNA or the expression control sequences; and the like.

For example, in a baculovirus expression system, both non-fusion transfer vectors, such
as, but not limited to, pVL941 (BamHI cloning site; Summers), pVL1393 (BamHI, SmaI, XbaI,
EcoRI, NotI, XmaIII, BglII and PstI cloning sites; Invitrogen), pVL1392 (BglII, PstI, NotI,
XmaIII, EcoRI, XbaI, SmaI and BamHI cloning sites; Summers and Invitrogen) and
pBlueBacIII (BamHI, BglII, PstI, NcoI and HindIII cloning sites, with blue/white recombinant
screening; Invitrogen), and fusion transfer vectors such as, but not limited to, pAc700 (BamHI
and KpnI cloning sites, in which the BamHI recognition site begins with the initiation codon;
Summers), pAc701 and pAc702 (same as pAc700, with different reading frames), pAc360
(BamHI cloning site 36 base pairs downstream of a polyhedrin initiation codon; Invitrogen
(1995)) and pBlueBacHisA, B, C (three different reading frames with BamHI, BglII, PstI, NcoI
and HindIII cloning sites, an N-terminal peptide for ProBond purification and blue/white
recombinant screening of plaques; Invitrogen (220)), can be used.

Mammalian expression vectors contemplated for use in the invention include vectors
with inducible promoters, such as the dihydrofolate reductase promoters, any expression
vector with a DHFR expression cassette or a DHFR/methotrexate co-amplification vector
such as pED (PstI, SalI, SbaI, SmaI and EcoRI cloning sites, with the vector expressing both
the cloned gene and DHFR; Kaufman, 1991). Alternatively, a glutamine
synthetase/methionine sulfoximine co-amplification vector can be used, such as pEE14
(HindIII, XbaI, SmaI, SbaI, EcoRI and BclI cloning sites, in which the vector expresses
glutamine synthetase and the cloned gene; Celltech). A vector that directs episomal
expression under the control of the Epstein-Barr virus (EBV) or nuclear antigen (EBNA) can
be used, such as pREP4 (BamHI, SfiI, XhoI, NotI, NheI, HindIII, NheI, PvuII and KpnI cloning
sites, constitutive RSV-LTR promoter, hygromycin selectable marker; Invitrogen), pCEP4
(BamHI, SfiI, XhoI, NotI, NheI, HindIII, NheI, PvuII and KpnI cloning sites, constitutive hCMV
immediate early gene promoter, hygromycin selectable marker; Invitrogen), pMEP4 (KpnI,
PvuI, NheI, HindIII, NotI, XhoI, SfiI, BamHI cloning sites, inducible metallothionein IIa gene
promoter, hygromycin selectable marker; Invitrogen), pREP8 (BamHI, XhoI, NotI, HindIII,
NheI and KpnI cloning sites, RSV-LTR promoter, histidinol selectable marker; Invitrogen),
pREP9 (KpnI, NheI, HindIII, NotI, XhoI, SfiI, BamHI cloning sites, RSV-LTR promoter, G418
selectable marker; Invitrogen), and pEBVHis (RSV-LTR promoter, hygromycin selectable
marker, N-terminal peptide purifiable via ProBond resin and cleaved by enterokinase;
Invitrogen).
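
When one of the listed cloning sites is chosen for inserting a sequence, it is usually worth
checking that the insert does not itself contain that enzyme's recognition site. The
following Python sketch shows one way such a check could look. It is an illustration only:
the function name and the example insert are hypothetical, and the table covers only a few
of the enzymes named above, using their commonly documented palindromic recognition
sequences.

```python
from typing import Dict, Iterable, List, Optional

# Commonly documented recognition sequences (5'->3') for some of the
# enzymes named in the vector descriptions. All are palindromic, so
# scanning one strand is sufficient.
RECOGNITION_SITES: Dict[str, str] = {
    "BamHI": "GGATCC",
    "BglII": "AGATCT",
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "KpnI": "GGTACC",
    "NcoI": "CCATGG",
    "NheI": "GCTAGC",
    "NotI": "GCGGCCGC",
    "PstI": "CTGCAG",
    "SmaI": "CCCGGG",
    "XbaI": "TCTAGA",
    "XhoI": "CTCGAG",
}


def find_sites(sequence: str,
               enzymes: Optional[Iterable[str]] = None) -> Dict[str, List[int]]:
    """Return 0-based start positions of each enzyme's site in `sequence`.

    Only enzymes with at least one hit appear in the result.
    """
    seq = sequence.upper()
    names = RECOGNITION_SITES if enzymes is None else enzymes
    hits: Dict[str, List[int]] = {}
    for name in names:
        site = RECOGNITION_SITES[name]
        positions: List[int] = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)  # allow overlapping hits
        if positions:
            hits[name] = positions
    return hits


# Hypothetical insert containing two BamHI sites and one EcoRI site.
example_insert = "ATGGGATCCAAAGAATTCTTTGGATCCTAA"
internal_sites = find_sites(example_insert)
```

An insert with an internal BamHI site, as in this example, would be cut by BamHI during
cloning, so another enzyme from the vector's list would be the safer choice.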

Selectable mammalian expression vectors for use in the invention include, but are not
limited to, pRc/CMV (HindIII, BstXI, NotI, SbaI and ApaI cloning sites, G418 selection;
Invitrogen), pRc/RSV (HindIII, SpeI, BstXI, NotI, XbaI cloning sites, G418 selection;
Invitrogen) and the like. Vaccinia virus mammalian expression vectors (see, for example,
Kaufman 1991) that can be used in the present invention include, but are not limited to,
pSC11 (SmaI cloning site, TK- and β-gal selection), pMJ601 (SalI, SmaI, AflI, NarI, BspMII,
BamHI, ApaI, NheI, SacII, KpnI and HindIII cloning sites; TK- and β-gal selection),
pTKgptF1S (EcoRI, PstI, SalI, AccI, HindII, SbaI, BamHI and Hpa cloning sites, TK or XPRT
selection) and the like.

Yeast expression systems that can also be used in the present invention include, but
are not limited to, the non-fusion pYES2 vector (XbaI, SphI, ShoI, NotI, GstXI, EcoRI, BstXI,
BamHI, SacI, KpnI and HindIII cloning sites; Invitrogen), the fusion pYESHisA, B, C (XbaI,
SphI, ShoI, NotI, BstXI, EcoRI, BamHI, SacI, KpnI and HindIII cloning sites, N-terminal
peptide purified with ProBond resin and cleaved with enterokinase; Invitrogen), pRS vectors
and the like.

CLAIMS

What is claimed is: 

1. A complex between two interacting proteins as defined in columns 1 and 4 in 
Table 2. 

2. A complex between two polynucleotides encoding the polypeptides of claim
1.

3. A recombinant host cell expressing the interacting polypeptides of said 
complex of protein-protein interaction of claim 1. 

4. Use of a SID®, an interaction or a prey to screen molecules that inhibit TGFβ
or inhibit a TGFβ super-family of cytokines pathway.

5. A molecule that inhibits TGFβ or inhibits a TGFβ super-family of
cytokines pathway.

6. Use according to Claim 4, wherein said screening occurs in mammalian cells 
or yeast cells. 

7. A SID® polypeptide comprising the SEQ ID No 63 to 98. 

8. A SID® polynucleotide comprising the SEQ ID No 27 to 62. 

9. A vector comprising the SID® polynucleotide comprising the SEQ ID No 27 to 
62. 

10. A fragment of said SID® polypeptide according to Claim 7. 

11. A variant of said SID® polypeptide according to Claim 7. 

12. A fragment of said SID® polynucleotide according to Claim 8. 

13. A variant of said SID® polynucleotide according to Claim 8. 

14. A vector comprising the SID® polynucleotide according to any one of Claims 8, 
12 or 13. 

15. A recombinant host cell containing the vectors according to Claim 14. 

16. A pharmaceutical composition comprising the molecule of claim 5 and a
pharmaceutically acceptable carrier.

17. A pharmaceutical composition comprising a SID® polypeptide SEQ ID No 63
to 98 and a pharmaceutically acceptable carrier.

18. A pharmaceutical composition comprising the recombinant host cells of Claim
15 and a pharmaceutically acceptable carrier.

19. A protein chip comprising the polypeptides of Table 2. 

20. Use of a ZNF8 protein for the preparation of a medicament for treating
diseases and/or disorders linked to or involving a TGFβ super-family of cytokines.

21. Use of a LAPTm5 protein for the preparation of a medicament for treating
diseases and/or disorders linked to or involving a TGFβ super-family of cytokines.

22. Use of a RNF11 protein for the preparation of a medicament for treating
diseases and/or disorders linked to or involving a TGFβ super-family of cytokines.

23. Use of a LMO4 protein for the preparation of a medicament for treating
prostate cancer.

24. Use of a PPC1 protein for the preparation of a medicament for treating
diseases and/or disorders linked to or involving a TGFβ super-family of cytokines.

25. Use of an HYPA protein for the preparation of a medicament for treating
diseases and/or disorders linked to or involving a TGFβ super-family of cytokines.

26. Use of a PTP protein for the preparation of a medicament for treating
diseases and/or disorders linked to or involving a TGFβ super-family of cytokines.

27. Use of an HYPK3 protein for the preparation of a medicament for treating
diseases and/or disorders linked to or involving a TGFβ super-family of cytokines.

28. Use of a KIAA1196 protein for the preparation of a medicament for treating
diseases and/or disorders linked to or involving a TGFβ super-family of cytokines.

29. Use of a FL20037 protein for the preparation of a medicament for treating
diseases and/or disorders linked to or involving a TGFβ super-family of cytokines.

30. Use of a complex between two interacting proteins as defined in columns 1 and
4 in Table 2 to screen molecules for diagnosing or treating transforming growth factor β
disorders and/or diseases.